Abstract

Geometric groundtruth at the character, word, and line levels is crucial for developing
and evaluating optical character recognition (OCR) algorithms. Kanungo and Haralick
proposed a closed-loop methodology for generating character-level groundtruth for
rescanned images. In this paper, we present a robust version of their methodology. We
grouped the feature points and used a feature point registration algorithm on the grouped
feature point set to estimate the transformation. The Euclidean distance between character
centroids was used as the error metric. We performed experiments on the University
of Washington data set.

1 Introduction


Character, word, and line-level geometric groundtruth is crucial for optical character
recognition (OCR) algorithm development and evaluation. Such groundtruth is typically
created manually and therefore its creation is time-consuming, expensive, and prone to
human errors.

Consider a case in which researchers already have geometric groundtruth for a small
set of document images but would like to use these document-groundtruth pairs to bootstrap
the construction of a larger (more varied) data set. Two scenarios are possible. In
the first scenario, the groundtruth for the set of original real document images is created
manually, and in the second scenario, the groundtruth for the set of original synthetic
document images is generated automatically. In both cases the algorithm developer
would like to print, photocopy, fax and rescan the original document images and then
automatically generate the geometric groundtruth for the rescanned documents.

In this paper, we present a point matching based algorithm to automatically generate
the groundtruth for rescanned images. The algorithm extracts feature points from the
original and rescanned images and then registers the two images using a point matching
algorithm. The groundtruth for the rescanned images is then generated by transforming
the groundtruth of the original images.

In Chapter 2, related research is summarized. The automatic groundtruth generation
methodology is outlined in Chapter 3, and the matching algorithms are discussed in
Chapter 4. We discuss the impact of image pattern complexity on image registration
in Chapter 5. Attributed point matching is introduced in Chapter 6. The error metric
and experimental protocol for conducting controlled experiments are discussed in
Chapter 7. Experimental results are presented in Chapter 8. In Chapter 9, image
registration is used for generating groundtruth for microfilmed and faxed images.
Finally, in Chapter 10, we provide our conclusions.

Part  of  the  work  presented  in  this  paper  appeared  in  DAS2000  [15]. 

2  Previous  Work 

Kanungo and Haralick [13, 14] proposed a methodology for automatically generating
the groundtruth of a rescanned image by estimating the transformation between two
images and then transforming the groundtruth using the estimated transformation. They
estimated the transformation from corresponding pairs of feature points. Four corner
points of the images were used as feature points to estimate the transformation. The
point matching registration algorithm was then improved by using a robust local template
matching algorithm. However, their method is not robust when part of the image is
missing or there are extra feature points in the image. This drawback can be overcome by
using all the available feature points. Hobby [10] improved the registration by considering
all feature points. He used a direct search optimization method to minimize the mismatch
in the estimated transformation. However, his method finds a local minimum instead of
a global minimum. More recently, Viard-Gaudin et al. [25] proposed a methodology for
creating groundtruth for handwritten documents. They designed a database of online
and offline handwritten data. They manually determined corresponding points in the
online and offline domain and then estimated the affine transformation between the two
coordinate systems.

Figure 1: The automatic closed-loop methodology of Kanungo and Haralick.

Numerous feature point matching algorithms have been reported in the literature.
Baird [1] used feature points to do image matching. Breuel [3] also proposed an algorithm
for feature point matching. He estimated the transformation by subdividing the
transformation space. Huttenlocher et al. [11, 12] used a branch-and-bound algorithm
using Hausdorff distance as the distance measure. They used the distance transform to
determine nearest neighbors. Mount et al. [20] proposed a modified branch-and-bound
algorithm based on partial Hausdorff distance. They used kd-tree-based nearest neighbor
searching to find correspondences. These algorithms are discussed in more detail in
Section 4.2.

3  The  Automatic  Groundtruthing  Methodology 

Given an image and its groundtruth information, we wish to generate groundtruth for
an image which is a transformed (scanned, photocopied, microfilmed, faxed, etc.) version
of the original image. The basic idea is to estimate the transformation between
the two images and then transform the groundtruth information using the estimated
transformation.

Figure 1 illustrates the methodology that Kanungo and Haralick [14] used for generating
groundtruth information for real images. Four corner points of the images were
used as feature points to estimate the transformation. The four feature points, $p_1$, $p_2$, $p_3$
and $p_4$ were determined by the following equations:

$$p_1 = \arg\min_{a_i} \big(x(a_i) + y(a_i)\big),$$

$$p_2 = \arg\max_{b_i} \big(x(b_i) - y(b_i)\big),$$

$$p_3 = \arg\min_{c_i} \big(x(c_i) + y(c_i)\big),$$

$$p_4 = \arg\max_{d_i} \big(x(d_i) - y(d_i)\big),$$

where $a_i$, $b_i$, $c_i$ and $d_i$ are respectively the upper-left, upper-right, lower-right, and
lower-left corners of the bounding boxes of each connected component in the image. More
improvement is achieved by applying a local template matching algorithm described in
Figure 2. The dashed rectangle in Figure 1 is the module that is being replaced by the
algorithm described in this paper.

Figure 2: Local template matching.

First  we  extract  the  connected  components  of  the  original  and  transformed  images. 
The  number  of  connected  components  in  a  typical  document  image  is  1000-5000,  which 
makes  the  running  time  of  the  estimation  procedure  too  large.  To  reduce  the  complexity 
of  the  problem,  we  group  the  connected  components.  The  groups  are  approximately  at 
the  word  level.  As  a  result  of  grouping,  the  number  of  feature  points  to  be  considered 
is  reduced  to  about  20-25%  of  its  original  size.  We  explain  the  feature  point  grouping 
procedure  in  Section  4.1. 

Using  the  two  feature  point  sets,  one  from  the  original  image  and  the  other  from  the 
transformed  image,  we  estimate  the  transformation  by  using  the  feature  point  registration 
algorithms  described  in  Section  4.2.  Figure  3  shows  an  illustration  of  this  procedure. 

4  The  Matching  Algorithm 

We need to find the correspondence and the transformation between two point sets.
There  are  two  major  steps  in  the  matching  procedure:  (i)  feature  point  grouping  and  (ii) 
feature  point  registration. 

Figure 3: The automatic registration methodology.


4.1  Feature  point  grouping 


To reduce the size of the problem, we group connected components at the word token
level. Let $B$ be the set of bounding boxes, $NN_k(b)$ be the $k$ nearest neighbors of bounding
box $b$, $PQ$ be a priority queue, $\tau$ be a threshold, and $root(b)$ be the root of $b$, which is
initialized to be $b$. The key of the priority queue is the distance between the two bounding
boxes. Bounding boxes with the smallest distance appear on top of the queue. In selecting
the threshold, we used the threshold selection method of Kittler and Illingworth [16].
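
The quantities just introduced suggest a merge procedure along the following lines. This is a minimal sketch and not the authors' procedure from Figure 4: the gap-based box distance `box_gap`, the union-find bookkeeping for `root`, and the function names are assumptions made for illustration, and the threshold `tau` is taken as given.

```python
import heapq
import math


def box_gap(a, b):
    """Gap between boxes (x_min, y_min, x_max, y_max); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


def group_boxes(boxes, k, tau):
    """Merge boxes whose k-nearest-neighbour gap is at most tau; return index groups."""
    n = len(boxes)
    root = list(range(n))                # root(b) is initialized to b itself

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]      # path halving
            i = root[i]
        return i

    pq = []                              # priority queue keyed by box distance
    for i in range(n):
        dists = sorted((box_gap(boxes[i], boxes[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:           # the k nearest neighbours of box i
            heapq.heappush(pq, (d, i, j))

    while pq:
        d, i, j = heapq.heappop(pq)      # smallest distance first
        if d > tau:
            break                        # every remaining pair is farther apart
        ri, rj = find(i), find(j)
        if ri != rj:
            root[rj] = ri                # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With four boxes forming two "words" separated by a wide gap, `group_boxes(boxes, k=2, tau=5)` returns two groups. The brute-force neighbour search is quadratic and only meant for illustration.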

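The thresholding works as follows. Assume that the observations come from a mixture
of two Gaussian distributions having respective means and variances $(\mu_1, \sigma_1^2)$ and
$(\mu_2, \sigma_2^2)$ and respective proportions $q_1$ and $q_2$. We determine the threshold $T$ that
results in the parameters $q_1, q_2, \mu_1, \mu_2, \sigma_1, \sigma_2$ that minimize the Kullback directed
divergence [18] $J$ from the observed histogram $P(1), \ldots, P(I)$ to the unknown mixture
distribution $f$, where

$$J = \sum_{i=1}^{I} P(i) \log \frac{P(i)}{f(i)} = \sum_{i=1}^{I} P(i) \log P(i) - \sum_{i=1}^{I} P(i) \log f(i)$$

and

$$f(i) = \frac{q_1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{1}{2}\left(\frac{i-\mu_1}{\sigma_1}\right)^2} + \frac{q_2}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{1}{2}\left(\frac{i-\mu_2}{\sigma_2}\right)^2}.$$

Because the first term of $J$ does not depend on the unknown parameters, the minimization
can be done by minimizing the second term. Assume that the modes are well separated.
Then for some threshold $t$ that separates the two modes

$$f(i) \approx \begin{cases} \dfrac{q_1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{1}{2}\left(\frac{i-\mu_1}{\sigma_1}\right)^2}, & i \le t \\[2ex] \dfrac{q_2}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{1}{2}\left(\frac{i-\mu_2}{\sigma_2}\right)^2}, & i > t \end{cases}$$

The function $H(t)$ to be minimized can then be written as

$$H(t) = -\sum_{i=1}^{t} P(i) \log \left[ \frac{q_1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{1}{2}\left(\frac{i-\mu_1}{\sigma_1}\right)^2} \right] - \sum_{i=t+1}^{I} P(i) \log \left[ \frac{q_2}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{1}{2}\left(\frac{i-\mu_2}{\sigma_2}\right)^2} \right].$$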

Figure 4: The feature point grouping algorithm.

From the assumption of well-separated modes, the mean and variance estimated from
$P(1), \ldots, P(t)$ will be close to $\mu_1$ and $\sigma_1^2$, and the same holds for the second part,
$P(t+1), \ldots, P(I)$, with $\mu_2$ and $\sigma_2^2$. By using these estimated values, we can evaluate
$H(t)$ for each $t$. We choose the threshold $t$ which minimizes $H(t)$.
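
The selection rule can be sketched in code as follows. This is an illustrative sketch, not the authors' implementation: it assumes a histogram given as a plain list with 0-based bins, estimates $q$, $\mu$ and $\sigma^2$ on each side of every candidate $t$, and uses the fact that, with the estimated values substituted, $H(t)$ reduces to $\frac{1}{2}(1 + \log 2\pi) - q_1 \log q_1 - q_2 \log q_2 + q_1 \log \sigma_1 + q_2 \log \sigma_2$. The function name is hypothetical.

```python
import math


def min_error_threshold(hist):
    """Return the 0-based bin index t minimizing H(t); bins 0..t form the first class."""
    total = float(sum(hist))
    p = [h / total for h in hist]
    n = len(p)
    best_t, best_h = None, math.inf
    for t in range(n - 1):
        q1 = sum(p[: t + 1])
        q2 = sum(p[t + 1:])
        if q1 <= 0.0 or q2 <= 0.0:
            continue                      # one class is empty
        mu1 = sum(i * p[i] for i in range(t + 1)) / q1
        mu2 = sum(i * p[i] for i in range(t + 1, n)) / q2
        var1 = sum((i - mu1) ** 2 * p[i] for i in range(t + 1)) / q1
        var2 = sum((i - mu2) ** 2 * p[i] for i in range(t + 1, n)) / q2
        if var1 <= 0.0 or var2 <= 0.0:
            continue                      # degenerate class, log(sigma) undefined
        h = (0.5 * (1.0 + math.log(2.0 * math.pi))
             - q1 * math.log(q1) - q2 * math.log(q2)
             + 0.5 * q1 * math.log(var1) + 0.5 * q2 * math.log(var2))
        if h < best_h:
            best_t, best_h = t, h
    return best_t
```

For a histogram with two well-separated bumps, the returned index falls in the valley between them.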

The  grouping  algorithm  is  illustrated  in  Figure  4.  In  Figure  5,  we  show  an  image 
overlaid  with  bounding  boxes  of  the  grouped  connected  components.  This  sample  image 
contains  2127  connected  components,  and  442  groups.  We  can  see  that  these  groups  are 
approximately  at  the  word  level.  Grouping  takes  less  than  10  seconds  per  image  when 
run  on  a  Sun  Ultra-Sparc  5  with  clock  speed  361.2  MHz. 

4.2  Feature  point  based  registration  algorithms 

With  feature  points  generated  by  the  methodology  described  in  Section  4.1,  we  need 
to  estimate  the  transformation  between  the  two  sets  of  feature  points.  In  this  section, 
we  discuss  several  registration  algorithms  that  can  be  used  for  this  purpose.  All  the 
algorithms  work  on  feature  points,  and  therefore  we  can  use  any  of  these  methods  for 
our  matching  problem.  The  algorithms  take  two  sets  of  feature  points  as  input,  and 
estimate  the  transformation  between  them.  We  also  need  to  give  the  bounds  for  the 
initial  search  space. 

4.2.1  Huttenlocher  et  al.’s  algorithm 

Huttenlocher et al. [11, 12] proposed a feature matching algorithm using the Hausdorff
distance as a similarity measure. A set of transformations (a cell) is defined such that
the optimum transformation lies inside this cell. A list of interesting cells is created
and initialized to be this cell. Let $I$ be the original points and $R$ be the transformed
points. For each cell in the list, determine whether it is possible that the cell contains a
transformation $t$ for which $H_{LK}(I, t(R)) < \tau$, where

$$H_{LK}(I, t(R)) = \max\big(h_L(I, t(R)),\; h_K(t(R), I)\big)$$

and

$$h_K(t(R), I) = K^{\mathrm{th}}_{r \in t(R)} \min_{i \in I} \| i - r \|.$$

If the rule is satisfied, the cell is marked as interesting. Once the entire list has been
scanned, a new list of smaller cells (of the same size) is constructed such that it completely
covers the interesting cells. This step is repeated until the cell size is smaller than a
threshold. Figure 6 shows the pseudo-code of this algorithm.

Figure 5: Sample document image overlaid with the bounding boxes of the grouped
connected components.

            add c to interesting list IL
        Create a new L with smaller cells s.t. they completely cover IL.
    end

    where $H_{LK}(I, t(R)) = \max(h_L(I, t(R)), h_K(t(R), I))$,
    and $h_K(t(R), I) = K^{\mathrm{th}}_{r \in t(R)} \min_{i \in I} \| i - r \|$.

Figure 6: Huttenlocher et al.’s algorithm.
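
The rank-based Hausdorff measure used in this algorithm can be evaluated by brute force as sketched below. The function names are illustrative, points are assumed to be `(x, y)` tuples, and the nearest-neighbour search is a plain double loop rather than the distance transform that Huttenlocher et al. use.

```python
import math


def directed_partial_hausdorff(A, B, k):
    """k-th smallest (1-based) nearest-neighbour distance from the points of A to B."""
    nearest = sorted(
        min(math.hypot(ax - bx, ay - by) for bx, by in B) for ax, ay in A
    )
    return nearest[k - 1]


def partial_hausdorff(I, R, L, K):
    """Max of the L-th ranked I-to-R distance and the K-th ranked R-to-I distance."""
    return max(directed_partial_hausdorff(I, R, L),
               directed_partial_hausdorff(R, I, K))
```

Choosing a rank below the set size lets the measure ignore that many worst-matching points, which is what makes it tolerant of missing or extra feature points.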

4.2.2  Breuel’s  algorithm 

Breuel [3] proposed a registration algorithm called RAST (Recognition using Adaptive
Subdivisions of Transformation space). We define a box to be a set of transformations.
Initially, the box contains all the transformations we would like to consider. The algorithm
finds all possible correspondences between the two feature point sets and evaluates
the quality of the match resulting from this set of correspondences.

If the upper bound on the best possible match is either (i) smaller than the required
minimum quality or (ii) smaller than the best solution found so far, we abandon this
part of the transformation space. Otherwise, we subdivide the current box into smaller
regions and repeat the same procedure recursively. This process terminates when all
boxes have correspondences, or when a threshold is reached. The RAST algorithm is
given in Figure 7.

    Input
        I: Original point sets
        R: Transformed point sets
    Output
        t: Estimated transformation

    SearchBox(box, depth, candidates)
    begin
        intersecting = all candidates that intersect box
        containing = all candidates that contain box
        axis = depth mod 4

        if evaluate(intersecting) < best_Quality then return
        else if (candidates = containing) or (depth > max_Depth)
            then best_Quality = evaluate(intersecting)
                 best_Box = box
                 return
        else SearchBox(left(box, axis), depth+1, intersecting)
             SearchBox(right(box, axis), depth+1, intersecting)
    end

    RAST(constraints, max_Depth, min_Quality)
    begin
        best_Quality = min_Quality
        best_Box = none
        SearchBox(entire_box, 0, constraints)
        return best_Box
    end

Figure 7: Breuel’s algorithm.

4.2.3  Mount  et  al.’s  algorithm 

Mount et al. [20] proposed a branch-and-bound algorithm for feature point matching.
They used the partial Hausdorff distance [11] as the similarity measure. Given point sets
$I$ and $R$ and parameter $k$, the partial Hausdorff distance is defined as

$$H_k(I, R) = K^{\mathrm{th}}_{i \in I} \min_{r \in R} \mathrm{dist}(i, r),$$

that is, the $k$-th smallest of the nearest-neighbor distances from the points of $I$ to $R$.

Let $T$ be the range of the affine transformation, and $\epsilon$ be the error bound. The basic
approach of the branch-and-bound algorithm is as follows. For a given $T$, we first compute
the upper and lower bounds on similarity. Next, a priority queue is constructed such that
the element that has the largest size is on top of the queue. In each iteration, we pick
up the largest element from the priority queue and see if its similarity lower bound is
better than the current best similarity. If not, we kill that element and proceed to the
next largest element. Otherwise, we compute the upper bound and check if it is better
than the current best similarity. If it is, we (i) update the best similarity to be the upper
bound of the current element, (ii) update the best transformation, (iii) split the element
into two parts along the longest side, and (iv) insert both new elements into the priority
queue. This process is iterated until we achieve the target similarity or there are no
more elements to be processed in the queue. In computing the upper and lower bounds
of a given range of transformation, we use the kd-tree-based nearest neighbor searching
algorithm proposed in [2, 7]. The matching algorithm is illustrated in Figure 8.

    Input
        I: Original point sets
        R: Transformed point sets
        T: Initial search space
    Output
        t: Estimated transformation

    begin
        construct and initialize PQ with given T
        while PQ size ≠ 0 and best similarity > ε
        do
            T ← next element in PQ
            compute lower bound of similarity for T
            if lower bound of T > best similarity - ε
                then kill this cell and proceed to the next one
            compute upper bound of similarity for T
            if upper bound of T < best similarity
                then update best similarity and transformation
            split T into T1 and T2
            insert T1 and T2 into PQ
        end
    end

Figure 8: Mount et al.’s algorithm.
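
The loop of Figure 8 can be sketched for a much simpler setting. The code below is not the authors' implementation: it assumes a search over 2-D translations only (not full affine transformations), uses brute-force nearest-neighbour search instead of a kd-tree, takes the value at the cell centre as the upper bound, and derives the lower bound by subtracting the cell's half-diagonal (valid because shifting every point by at most that amount changes the ranked distance by at most the same amount). All names are illustrative.

```python
import heapq
import math


def ranked_distance(I, R, k, t):
    """k-th smallest nearest-neighbour distance from I shifted by t to R."""
    tx, ty = t
    d = sorted(min(math.hypot(x + tx - rx, y + ty - ry) for rx, ry in R)
               for x, y in I)
    return d[k - 1]


def branch_and_bound(I, R, k, cell, eps):
    """Search translations in cell = (x_lo, x_hi, y_lo, y_hi); return (value, translation)."""
    best, best_t = math.inf, None
    counter = 0                                   # tie-breaker for the heap
    size = max(cell[1] - cell[0], cell[3] - cell[2])
    pq = [(-size, counter, cell)]                 # largest cell on top
    while pq and best > eps:
        _, _, (x0, x1, y0, y1) = heapq.heappop(pq)
        centre = ((x0 + x1) / 2.0, (y0 + y1) / 2.0)
        radius = math.hypot(x1 - x0, y1 - y0) / 2.0
        upper = ranked_distance(I, R, k, centre)  # achieved by a member of the cell
        lower = max(0.0, upper - radius)          # no member can do better than this
        if lower > best - eps:
            continue                              # kill this cell
        if upper < best:
            best, best_t = upper, centre
        if x1 - x0 >= y1 - y0:                    # split along the longest side
            xm = (x0 + x1) / 2.0
            children = [(x0, xm, y0, y1), (xm, x1, y0, y1)]
        else:
            ym = (y0 + y1) / 2.0
            children = [(x0, x1, y0, ym), (x0, x1, ym, y1)]
        for c in children:
            counter += 1
            heapq.heappush(pq, (-max(c[1] - c[0], c[3] - c[2]), counter, c))
    return best, best_t
```

In this sketch every surviving cell is split, following the indentation-free reading of Figure 8; splitting only after an improvement is an equally plausible reading of the prose.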

4.2.4  Hobby’s  algorithm 

Hobby proposed a new approach to the registration problem [10]. In this algorithm, he
defined a mismatch function and found the minimum values using direct search optimization
methods, such as Nelder-Mead’s [21] and Torczon’s [24] algorithms. The mismatch
function is defined as follows. Let $R$ be the real image with connected components $C^R$,
and $I$ be the ideal image with groundtruth $G^I$. Using the initial affine transformation
$T^0$, transform $C^R$ to $C^I$. Then for each $g_j^I \in G^I$ we can choose the $c_k^I \in C^I$ such
that the distance $d(g_j^I, c_k^I)$ is minimized. Then apply a standard vector norm (he used
the $L_4$ norm) to the resulting list of $d$ values.
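
A mismatch function of this shape can be sketched as follows. The sketch assumes the components have already been mapped into the ideal image's coordinates, and it uses the Euclidean distance between box centroids purely as a stand-in for Hobby's box distance; `mismatch` and `centroid_distance` are illustrative names.

```python
import math


def centroid_distance(a, b):
    """Stand-in distance: Euclidean distance between the centroids of two boxes
    given as (x_min, y_min, x_max, y_max). Not Hobby's box distance."""
    ax, ay = (a[0] + a[2]) / 2.0, (a[1] + a[3]) / 2.0
    bx, by = (b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0
    return math.hypot(ax - bx, ay - by)


def mismatch(groundtruth, components, dist=centroid_distance, order=4):
    """L_order norm of the distances from each groundtruth box to its nearest component."""
    nearest = [min(dist(g, c) for c in components) for g in groundtruth]
    return sum(d ** order for d in nearest) ** (1.0 / order)
```

A direct search optimizer would then re-evaluate `mismatch` for each candidate transformation applied to the components and keep the transformation with the smallest value.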

    Input
        I: Original point sets
        R: Transformed point sets
        $T^0$: Initial transformation from R to I
    Output
        $\theta$: Estimated transformation, $\theta = (a_{xx}, a_{xy}, a_{yx}, a_{yy}, t_x, t_y)$

    $d(g, c)$: Distance measure between $g$ and $c$

    begin
        Let $C^R$ be the connected components of R
        Let $G^I$ be the groundtruth of I
        $C^I \leftarrow \emptyset$
        for each $c \in C^R$
            $C^I \leftarrow C^I \cup \{T^0(c)\}$
        end
        for each $g_j \in G^I$
            compute $d(g_j, c_k)$ for each $c_k \in C^I$
            $k_j = \arg\min_k d(g_j, c_k)$
        Find $\theta$ that minimizes the function
            $m(\theta; I, G^I, R) = \big( \sum_j d(g_j, c_{k_j})^4 \big)^{1/4}$
    end

Figure 9: Hobby’s algorithm.

The distance measure $d$ is defined as follows. Assume that we have two boxes $A$ and
$B$. The distance between them is defined to be

$$d(A, B) = \min\big( d_f(A_{x1}, A_{x2}, B_{x1}, B_{x2}) + d_f(A_{y1}, A_{y2}, B_{y1}, B_{y2}),\; d_f(B_{x1}, B_{x2}, A_{x1}, A_{x2}) + d_f(B_{y1}, B_{y2}, A_{y1}, A_{y2}) \big) + d_p(A_{x2} - A_{x1}, B_{x2} - B_{x1}) + d_p(A_{y2} - A_{y1}, B_{y2} - B_{y1})$$

where the $x1$ and $x2$ subscripts refer to a box’s minimum and maximum x coordinates
and the $y1$ and $y2$ subscripts refer to a box’s minimum and maximum y coordinates. $d_f$
and $d_p$ are defined to be

$$d_f(x_1, x_2, x_3, x_4) = \begin{cases} 0 & \text{if } x_3 \le x_1 \text{ and } x_2 \le x_4 \\ \min(|x_3 - x_1|, |x_4 - x_2|) + \max(0,\; x_2 - x_1 - (x_4 - x_3)) & \text{otherwise} \end{cases}$$

$$d_p(a, b) = \max(0,\; \max(a, b) - 8 \min(a, b)).$$

He used four corner points of the image as used by Kanungo and Haralick [14] to estimate
the initial affine transformation $T^0$. Figure 9 shows his algorithm. More details about
this algorithm and the distance measure can be found in [10].




5  The  Impact  of  Pattern  Complexity  on  Image  Registration 


It  is  clear  that  the  performance  of  the  registration  algorithms  described  in  Section  4.2 
depends  on  the  number  of  feature  points  to  be  registered.  However,  the  complexity  of 
the  image  may  also  affect  algorithm  performance. 

In  this  section,  we  examine  the  impact  of  the  complexity  of  an  image  on  the  objective 
function  and  the  algorithm  performance.  For  all  the  experiments  described  in  this  section, 
we fixed the number of points in each image to be 500.

5.1  Impact  on  objective  function 

Two  extreme  cases  are  considered,  one  with  an  asymmetric  image,  and  the  other  with  a 
highly  symmetric  image.  Figure  10  is  an  example  of  an  asymmetric  image.  This  image 
consists  of  500  data  points  on  8  line  segments;  most  of  the  lines  are  not  parallel  to  each 
other.  In  this  image,  the  gaps  between  points  on  the  line  segments  are  varying,  making 
the  line  segments  asymmetric.  For  this  data  set,  we  can  anticipate  that  the  objective 
function  should  converge  to  the  global  minimum  smoothly  (there  would  not  be  many 
local minima). To show the six-dimensional objective function, we fixed five parameters
while varying one parameter around the optimal solution.

Figure 11 shows the impact of changes in the first four parameters of the affine
transformation. The impact of changes in the two translation parameters is shown in
Figure 12. As we anticipated, there are very few local minima, making it faster for the
algorithm to converge to the global minimum.

Figure 13 is an example of a symmetric image. This image consists of 500 data
points with 50 parallel line segments. In this case, we fix the gap between points to
be constant. For this image, we can anticipate that there are many local minima in
the objective function, because if we translate the image by the distance between the
points/lines, this results in another good match (even though not as good as the global
minimum). Therefore, the objective function will have some periodic structure in the
translation direction. Figure 15 shows this behavior of the objective function. For both
x and y translations, there are periodic local minima in the objective function. In fact,
the periods correspond to the distances between points in the two directions. For the
other affine parameters, similar behavior is observed, causing the objective function to
have many local minima as the parameters change. This behavior is shown in Figure 14.
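
The periodic structure can be reproduced on a toy example. The sketch below is an assumption-laden illustration rather than the experiment reported here: it uses a small regular grid (10 lines of 10 points, spacing 10) instead of the 500-point images, and the mean nearest-neighbour distance as a stand-in objective. It sweeps the x translation and reports where strict local minima occur.

```python
import math


def objective(points, ref, tx):
    """Mean nearest-neighbour distance after shifting points by tx along x."""
    total = 0.0
    for x, y in points:
        total += min(math.hypot(x + tx - rx, y - ry) for rx, ry in ref)
    return total / len(points)


def periodic_minima(spacing=10, n=10, step=0.5, limit=25.0):
    """Return the x translations at which the objective has a strict local minimum."""
    grid = [(spacing * c, spacing * r) for r in range(n) for c in range(n)]
    steps = int(limit / step)
    shifts = [s * step for s in range(-steps, steps + 1)]
    curve = [objective(grid, grid, tx) for tx in shifts]
    return [shifts[i] for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]]
```

On this grid the local minima fall at multiples of the point spacing, with the global minimum at zero translation; the other minima are shallower because the columns shifted past the edge of the grid lose their exact match.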


5.2  Impact  on  algorithm  performance 

The shape of the objective function affects the performance of the algorithm. Several
algorithms can find the global optimum even when there are numerous local minima.
However, the more local minima the objective function has, the harder it is for these
algorithms to find the global one. When we ran the branch-and-bound algorithm
described in Section 4.2.3, it took 39 seconds on the asymmetric image of Figure 10, and
138 minutes on the symmetric image of Figure 13.

Table 1 shows the running times of the branch-and-bound algorithm when applied to
images with various types of complexity. Figures 16-18 show the layout of these images.
From this timing information, we can see that as the image becomes more symmetric,
the running time for registration increases. In many cases, document images are highly
symmetric, having similar layout to that in Figure 13. This fact tells us that registration
of document images usually takes more time than for more asymmetric images, such as
satellite images and video images.

    Image type    Number of lines    Gap type                   Running time
    Asymmetric    8                  Variable                   39 sec.
    Asymmetric    8                  Constant                   51 sec.
    Symmetric     8                  Variable/diff. direction   54 sec.
    Symmetric     8                  Variable/same direction    98 sec.
    Symmetric     50                 Variable                   68 min.
    Symmetric     50                 Constant                   138 min.

Table 1: Timing information on images with various complexities.

Figure 10: Layout of asymmetric image (500 points).

Figure 11: Objective function of asymmetric image.

Figure 12: Objective function of asymmetric image (translation).

Figure 13: Layout of symmetric image (500 points).

Figure 14: Objective function of symmetric image.

Figure 15: Objective function of symmetric image (translation).

Figure 16: Asymmetric image with constant gaps.

6  Attributed  Point  Matching 

To  improve  algorithm  performance,  we  introduce  the  notion  of  attributes  of  feature  points 
into  the  similarity  measure.  Attributes  can  be  color,  area,  width,  height,  aspect  ratio,  or 
number  of  black  pixels.  The  similarity  measure  is  now  a  function  of  the  distance  between 
the  points  as  well  as  the  similarity  between  their  attributes.  We  use  the  number  of  black 
pixels  as  an  attribute  of  the  feature  points.  As  discussed  in  Chapter  4,  a  feature  point 
represents  a  group  of  connected  components.  Therefore,  we  can  count  the  number  of 
black  pixels  in  each  group  of  connected  components. 

Figure 17: Symmetric image with variable gaps (different directions).

Figure 18: Symmetric image with variable gaps (same direction).

Now we need to define the similarity measure for the attribute. Let $\Delta w$ be the difference
between the numbers of black pixels in two feature points, and $d$ be the Euclidean
distance between them. Then the new similarity $sim_a$ is defined to be

$$sim_a = p \, \frac{1}{\lambda_1} \exp\left(-\frac{\Delta w}{\lambda_1}\right) + (1 - p) \, \frac{1}{\lambda_2} \exp\left(-\frac{d}{\lambda_2}\right),$$

where $\lambda_1 = E[\Delta w]$, $\lambda_2 = E[d]$, and $0 \le p \le 1$.

By changing $p$ we can control the weight of the attribute. For example, if we use only the
distance, we can set $p$ to be 0, so that the first term of $sim_a$ is 0. When the distance is 0,
the similarity is also 0, and when the distance goes to infinity, the similarity approaches 1.

Instead of partial Hausdorff distance, we use the new attributed similarity as the
similarity measure for Mount et al.’s algorithm described in Section 4.2.3. In Figures 19-22
we show the behavior of the algorithm for the two similarity measures. The image
contains 30 randomly generated points. We then remove 10% of the points, introduce
the same number of outlier points, and transform the image with a 5° rotation and an x
translation of 50. The running time for partial Hausdorff distance is 41 seconds, whereas
it takes 26 seconds for attributed similarity. For comparison, we multiply the attributed
similarity by 100 so that the similarity has the range [0,100] instead of [0,1]. Figure 19 is
the graph of best similarity at each iteration. We observe that the attributed similarity
decreases faster than the partial Hausdorff distance.

In  Figure  20  we  compare  the  maximum  size  of  the  cell  at  each  iteration  for  two 
similarity  measures.  The  attributed  measure  also  decreases  faster  in  this  case.  The 
number  of  active  cells  is  important  in  terms  of  system  resources.  The  maximum  number 
of  active  cells  represents  the  memory  usage  of  the  algorithm.  As  we  observe  in  Figure  21, 
the  maximum  number  of  active  cells  for  attributed  similarity  is  less  than  half  that  for 
partial  Hausdorff  distance.  Figure  22  shows  the  best  similarity  as  a  function  of  the  search 
tree  level.  We  observe  that  they  are  similar  to  each  other,  and  therefore  we  can  suppose 
that  in  both  cases  they  take  similar  paths  in  the  search  tree  to  reach  the  optimal  solution. 


7  Error  Metric  and  Experimental  Protocol 
7.1  Error  metric 

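For the analysis of the experimental results, we need to define an error criterion. Let $G$ be
the set of groundtruth elements $g_i$, $i = 1, \ldots, N$, where $N$ is the number of characters in
the image. Typically, $g_i$ is a tuple: $g_i = (x_i, y_i, w_i, h_i, f_i) \in \mathbb{R} \times \mathbb{R} \times \mathbb{R}^+ \times \mathbb{R}^+ \times \mathcal{F}$, where
$x_i, y_i$ are the x- and y-coordinates of the upper-left corner of the character-level bounding
box, $w_i, h_i$ are the width and height of that bounding box, and $f_i$ is the font. Let $\theta$ and $\hat{\theta}$
denote the true and estimated transformations respectively. We can get the groundtruth
for the rescanned image by transforming $G$ using the estimated transformation. Then
we can define $G^{\theta}$ and $G^{\hat{\theta}}$ to be the sets of transformed groundtruth elements as follows:

$$G^{\theta} = T^{\theta}(G) \text{ with elements } g_i^{\theta} = (x_i^{\theta}, y_i^{\theta}, w_i^{\theta}, h_i^{\theta}, f_i^{\theta})$$

$$G^{\hat{\theta}} = T^{\hat{\theta}}(G) \text{ with elements } g_i^{\hat{\theta}} = (x_i^{\hat{\theta}}, y_i^{\hat{\theta}}, w_i^{\hat{\theta}}, h_i^{\hat{\theta}}, f_i^{\hat{\theta}})$$

We can compute $g_i^{\theta}$ and $g_i^{\hat{\theta}}$ as follows: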

Figure 19: Best similarity vs. number of iterations.

Figure 20: Maximum cell size vs. number of iterations.

Figure 21: Number of active cells vs. number of iterations.

Figure 22: Best similarity vs. tree level.

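\[ (x_i^{\theta}, y_i^{\theta})^{\top} = T^{\theta}(x_i, y_i)^{\top}, \qquad (x_i^{\hat{\theta}}, y_i^{\hat{\theta}})^{\top} = T^{\hat{\theta}}(x_i, y_i)^{\top}. \]

To define $w_i^{\theta}$, $h_i^{\theta}$, $w_i^{\hat{\theta}}$, and $h_i^{\hat{\theta}}$, let $u_i, v_i$ be the x- and y-coordinates of the lower-right corner
of the bounding box:

\[ u_i = x_i + w_i, \qquad v_i = y_i + h_i \]

\[ (u_i^{\theta}, v_i^{\theta})^{\top} = T^{\theta}(u_i, v_i)^{\top}, \qquad (u_i^{\hat{\theta}}, v_i^{\hat{\theta}})^{\top} = T^{\hat{\theta}}(u_i, v_i)^{\top} \]

\[ w_i^{\theta} = u_i^{\theta} - x_i^{\theta}, \qquad h_i^{\theta} = v_i^{\theta} - y_i^{\theta} \]

\[ w_i^{\hat{\theta}} = u_i^{\hat{\theta}} - x_i^{\hat{\theta}}, \qquad h_i^{\hat{\theta}} = v_i^{\hat{\theta}} - y_i^{\hat{\theta}} \]

Also, we assume that $f_i^{\theta} = f_i^{\hat{\theta}} = f_i$. The Euclidean distance $\delta_i$ between the centroids of the
corresponding bounding boxes is defined as

\[ \delta_i = \left\| \mathrm{Centroid}(g_i^{\theta}) - \mathrm{Centroid}(g_i^{\hat{\theta}}) \right\|. \]

Then the mean and maximum error measures for an image can be defined as follows:

\[ \rho_{mean}(G^{\theta}, G^{\hat{\theta}}) = \frac{1}{N} \sum_{i=1}^{N} \delta_i \]

\[ \rho_{max}(G^{\theta}, G^{\hat{\theta}}) = \max_{i \in \{1, \ldots, N\}} \delta_i \]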
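The error measures of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: boxes are plain `(x, y, w, h)` tuples, the font field is ignored, and the names `transform_box`, `centroid`, and `centroid_errors` are hypothetical helpers rather than part of the original implementation. The transformation is passed in as an arbitrary callable, so no particular parameterization is assumed.

```python
import math


def transform_box(box, transform):
    """Map a box (x, y, w, h) through a point transform.

    `transform` is any callable (x, y) -> (x', y'). The upper-left and
    lower-right corners are mapped, and the new width and height are the
    differences of the mapped corner coordinates.
    """
    x, y, w, h = box
    u, v = x + w, y + h              # lower-right corner of the box
    xt, yt = transform(x, y)         # transformed upper-left corner
    ut, vt = transform(u, v)         # transformed lower-right corner
    return (xt, yt, ut - xt, vt - yt)


def centroid(box):
    """Centre of a box given as (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def centroid_errors(boxes_true, boxes_est):
    """Return (mean, max) Euclidean distance between corresponding centroids."""
    if not boxes_true or len(boxes_true) != len(boxes_est):
        raise ValueError("need two non-empty box lists of equal length")
    deltas = []
    for b_true, b_est in zip(boxes_true, boxes_est):
        cx_t, cy_t = centroid(b_true)
        cx_e, cy_e = centroid(b_est)
        deltas.append(math.hypot(cx_t - cx_e, cy_t - cy_e))
    return sum(deltas) / len(deltas), max(deltas)
```

Applying `transform_box` once with the true transformation and once with the estimated one, then passing both lists to `centroid_errors`, yields the mean and maximum error for one image.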


7.2  Experimental  methodology  and  protocol 

Our experiment was performed on the University of Washington data set [22]. This data
set contains journal images with character-level geometric groundtruth. We performed
two experiments, one on non-rotated images and the other on rotated images.

The experiment on non-rotated images was performed on 450 images. These images
were generated by transforming 10 randomly selected images from the University of
Washington data set by 45 different transformations. The rotation angle $R$ was set at zero
and the scale $S$ and translation $X_t$, $Y_t$ parameters were selected from the following sets:

\[ S = \{65\%, 80\%, 100\%, 120\%, 135\%\}, \]
\[ X_t = \{-50, 0, 50\}, \qquad Y_t = \{-100, 0, 100\}. \]

The initial search space was 60% to 140% for scale, -100 to 100 for X translation, and
-200 to 200 for Y translation.

For the experiment on rotated images, we generated another 450 images from the same
10 images. For each image, we have 45 different transformations described as follows:
We choose the scale parameter value from the set

\[ S = \{65\%, 80\%, 100\%, 120\%, 135\%\}, \]

rotation from the set

\[ R = \{0^{\circ}, 1^{\circ}, 3^{\circ}\}, \]

and the X, Y translations from the set

\[ (X_t, Y_t) = \{(0, 0), (50, 0), (100, 0)\}. \]

The initial search space was 60% to 140% for scale, -10° to 10° for rotation, -100 to 100
for X translation, and -200 to 200 for Y translation.
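Each of the two parameter grids described above contains 5 x 3 x 3 = 45 combinations, which together with the 10 source images gives the 450 test images per experiment. The following sketch enumerates both grids; the function names and the `(scale, rotation, tx, ty)` tuple layout are illustrative assumptions, not the original code.

```python
from itertools import product

# Scale factors shared by both experiments (65% ... 135%).
SCALES = (0.65, 0.80, 1.00, 1.20, 1.35)


def non_rotated_grid():
    """Transformations for the non-rotated experiment: rotation fixed at 0."""
    x_shifts = (-50, 0, 50)
    y_shifts = (-100, 0, 100)
    return [(s, 0.0, tx, ty)
            for s, tx, ty in product(SCALES, x_shifts, y_shifts)]


def rotated_grid():
    """Transformations for the rotated experiment: paired (X, Y) translations."""
    angles_deg = (0.0, 1.0, 3.0)
    shifts = ((0, 0), (50, 0), (100, 0))
    return [(s, r, tx, ty)
            for s, r, (tx, ty) in product(SCALES, angles_deg, shifts)]
```

Both functions return 45 tuples, so applying either grid to 10 images produces 450 transformed images.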


8  Results  and  Discussion 

In this section we describe the results of our controlled experiments. We used the
branch-and-bound method described in Section 4.2.3 for feature point registration.
Figure 23: Distributions of mean and maximum errors. (Panels: mean error distribution and
maximum error distribution; horizontal axis: distance between bounding boxes, in pixels.)
8.1  Experiments  on  non-rotated  images

To analyze the results, we generate the histogram of estimation errors. As discussed in
Section 7.1, we calculate $\rho_{mean}(G^{\theta}, G^{\hat{\theta}})$ and $\rho_{max}(G^{\theta}, G^{\hat{\theta}})$ for each image pair. For the
set of images $O$, the number of images that had errors in each range of width $\Delta$ is counted. The
following is the notation for this analysis. Let $O$ be the set of images, $T$ be the set of
transformations, $\Delta$ be the width of the range, $I$ be the set of transformed images, and $Q$
be the set of groundtruth elements $G_i$. The histograms of the mean and maximum error,
$H_{mean}(k; O, T, \Delta)$ and $H_{max}(k; O, T, \Delta)$, are defined as follows:

\[ H_{mean}(k; O, T, \Delta) = \left| \{\, i \in I \mid k\Delta \le \rho_{mean}(G_i^{\theta}, G_i^{\hat{\theta}}) < (k+1)\Delta \,\} \right| \]

\[ H_{max}(k; O, T, \Delta) = \left| \{\, i \in I \mid k\Delta \le \rho_{max}(G_i^{\theta}, G_i^{\hat{\theta}}) < (k+1)\Delta \,\} \right| \]

We have 450 transformed images for which groundtruth is estimated. The histograms
of the mean and maximum error distributions of this image set are shown in Figure 23.
We set $\Delta$ to be 0.4 pixel.
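The binning behind these histograms can be sketched as follows. The sketch assumes half-open bins, with bin k covering errors from k times the bin width up to, but not including, (k+1) times the bin width; the function name `error_histogram` and the use of a `Counter` are illustrative choices, not taken from the original implementation.

```python
import math
from collections import Counter


def error_histogram(errors, bin_width=0.4):
    """Count per-image errors in bins of equal width.

    Bin k holds every error e with k * bin_width <= e < (k + 1) * bin_width
    (assumed convention). Returns a Counter mapping bin index -> image count.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    return Counter(int(math.floor(e / bin_width)) for e in errors)
```

The function would be called once with the per-image mean errors and once with the per-image maximum errors, using a bin width of 0.4 pixel.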

From the results, we see that the estimated groundtruth is close to the true groundtruth,
with a mean error of less than 3 pixels and a maximum error of less than 5 pixels. The mean
of the mean error is 1.09 pixels, and the mean of the maximum error is 2.16 pixels. The
estimation takes 10 to 15 minutes per image when run on a Sun Ultra-Sparc 5 with clock
speed 361.2 MHz.

8.2  Experiment  on  rotated  images 

The same methodology as that for the non-rotated images was used for the experiment
on rotated images. Figures 24, 25, and 26 show the distributions of mean errors for the
images rotated by 0°, 1°, and 3°, respectively.

For the images with 0° rotation, we have a similar result to that in Section 8.1, with most
of the mean errors less than 3 pixels. However, for the images rotated by 1°, the mean
errors become larger, about 40 pixels, and for the images rotated by 3°, the average of the
mean errors is about 100 pixels.

9  Application:  Registration  for  microfilmed  and  faxed  images 
9.1  Image  registration  for  microfilmed  images 

In this section an experiment on microfilmed images is discussed. Assume that we are
given a set of images with known groundtruth, and corresponding microfilmed images.
We wish to generate the groundtruth for the microfilmed images from the available
groundtruth.

Figure 24: Distribution of mean errors for 0° rotated images.

Figure 25: Distribution of mean errors for 1° rotated images.

Figure 26: Distribution of mean errors for 3° rotated images.

(Figures 24 to 26 each show a mean error distribution; horizontal axis: distance between
bounding boxes, in pixels.)

In general, microfilmed images have the following features:

1.  Large black areas around the image (similar to photocopied images)

2.  A lot of small black pixels (so-called salt-and-pepper noise)

3.  Many broken characters

4.  Many merged characters.

Because of these features, it is helpful to filter out the connected components whose
area is too small (case 2) or too large (case 1). Also, using feature point grouping as
described in Section 4.1 helps, especially in cases 3 and 4. Consider the case in which
many characters are broken apart in a microfilmed image (see Figures 27 and 28). In
many cases, these broken parts are still very close to each other, so in most cases the
grouping algorithm regroups them.
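The area-based filtering step can be illustrated with a minimal sketch. It assumes that each connected component is represented as a dictionary with an `"area"` entry (for example, its black-pixel count); this representation, the function name, and the threshold values in the usage note are hypothetical, since the thresholds actually used are not stated here.

```python
def filter_components_by_area(components, min_area, max_area):
    """Keep only components whose area lies in [min_area, max_area].

    Components below min_area are treated as salt-and-pepper noise, and
    components above max_area as large black border regions.
    """
    if min_area > max_area:
        raise ValueError("min_area must not exceed max_area")
    return [c for c in components if min_area <= c["area"] <= max_area]
```

For example, `filter_components_by_area(components, 10, 5000)` would drop a 2-pixel speck and a 90000-pixel border region while keeping a 150-pixel character.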

Another case is when the characters are joined. In this case we have relatively large
connected components. However, the grouped result will be similar to that of the original
image, because in most cases the joined characters are not larger than words. Therefore
we still have reasonable feature points for the original and microfilmed images. This
matters because the registration algorithm is based on the feature points: without a good
correspondence between the two point sets, the registration algorithm cannot give a good
result.

Figures 29 and 30 are corresponding original and microfilmed images. Figure 31 is
the microfilmed image overlaid with the estimated groundtruth information. We used
the methodology discussed in Section 3; the groundtruth is at the word and zone level in
DAFS [8] format. The registration algorithm of Breuel [3], described in Section 4.2.2, was
used. The experiment was conducted on the University of Washington III data set [22]
with 978 images, and the corresponding microfilmed images. The registration took about
17 minutes per image when run on a Sun Ultra-Sparc 10 with clock speed 481.7 MHz.
Figure 27: Original image.

Figure 28: Microfilmed image with broken connected components.
Figure 29: Original image to be registered.

Figure 30: Microfilmed image to be registered.

Figure 31: Microfilmed image overlaid with estimated groundtruth.
9.2  Experiment  on  a  faxed  image

In Section 9.1, we discussed a methodology for generating groundtruth for microfilmed
images. The same methodology can be applied to other images such as photocopied or
faxed images. We faxed and rescanned an image, and ran the feature point registration
algorithm to produce the groundtruth for this image. Figure 32 shows the faxed image
overlaid with the estimated groundtruth.

10  Conclusions 

We have proposed an improvement over the automatic groundtruthing algorithm proposed
by Kanungo and Haralick. We used feature point grouping to reduce the complexity
of the problem. Then we used feature point registration algorithms on the grouped
feature point sets to estimate the transformation between two images. To analyze the
result of a controlled experiment, we defined the error metric to be the Euclidean distance
between the centroids of corresponding characters. Further reduction in groundtruth
location error can be achieved by using the local template matching algorithm described
by Kanungo and Haralick [13, 14].

The contributions of this paper are:

•  We made the image registration process more robust by using all the feature points
available from both the original and transformed images. Several point matching
algorithms were discussed and used for document image registration.

•  We studied the impact of pattern complexity on the registration process. By
observing the behavior of the objective function, we found that registration takes
more time on symmetric images than on asymmetric ones.

•  We also studied attributed point matching. Each feature point can have an
attribute, such as color, area, width, height, aspect ratio, or number of black pixels.
This attribute can be introduced into the similarity measure to make registration
faster and more accurate. We used the number of black pixels as an attribute, and
found the best similarity and maximum cell size at each iteration, as well as the
number of active cells at each iteration.

•  We used our algorithm to create groundtruth for scanned microfilm images and
faxed images.
Figure 32: Estimated groundtruth of faxed image.